ABSTRACT

This paper continues previous reports of a controlled 
multivariate evaluation of a junior high school open-education 
program. A new method of estimating program objectives and 
implementation is presented, together with the nature and degree of 
obtained student outcomes. Open-program students were found to
approve more highly of their learning environment and to enjoy higher 
self-concepts than the traditional program control students, and at 
no loss in academic achievement. Studies of several student "types" 
showed the open program to be superior for underachievers, 
introverts, and extraverts. The criteria for transition of the 
program from "experimental" to "adopted" are discussed. (Author) 
Controlled Multivariate Evaluation of Open Education:
Application of a Critical Model
Alan F. Sewell, Allan W. Dornseif, and Susan Gross Sturm
Matteson, Illinois School District 162

Among the plethora of proposed educational innovations and alternatives
of recent years, the concept of "open education" has remained remarkably
viable. A steady stream of open education implementations continues
throughout the United States and in a number of other lands as well. As noted
in a previous paper (Sewell & Dornseif, 1974), while the general theory of open
education suggests a movement toward certain educational objectives, in practice
it has been most distinctive as a movement away from traditional educational
methods. But because implementations of open education have typically been
introduced within existing educational systems (at least in the United States),
a substantial variety of compromises have been effected between the concepts
and methods of "pure" open education and the concepts and methods of traditional
education.

Only quite recently has a literature of open education evaluation begun 
to develop; some of this literature has been previously reviewed (Sewell &
Dornseif, 1974). Such reviews fail to indicate a clear and consistent pattern of
results; inadequacy of the evaluation design is a common flaw, but in addition
the question of degree of openness incorporated into any given implementation 
is typically left unanswered. 

In the absence of any significant standardization among the many open
education implementations, each must be individually evaluated. While one
purpose of the present paper is to report the results of an intensive and
extensive (progressive and continuous) evaluation of one specific program, a more
fundamental purpose is to delineate the characteristics and application of a
critical evaluation model.

Planning of the educational and evaluation programs began nearly a 
year prior to the inauguration of this open education program. (This program 
was soon characterized by the acronym "OSCAR," both in honor of a previous prin- 
cipal and to represent "Open Space for the Conceptualization of Attitudes and 
Responsibilities"; regardless of its merits, the acronym has persisted, and the 
open education program will be so identified herein.) Certain fundamental
questions have structured and guided the evaluation design and program, and these
continue to be pursued: (1) To what extent are the objectives of traditional 
education met within the open education program? (2) What different or supple- 
mentary objectives are met by the open program? (3) Are there types of students 
for whom either the traditional or open program is more suitable? 

It should be noted that these questions focus specifically upon student
outcomes, and they clearly imply direct comparisons between outcomes provided
within the open education program and those provided by a traditional education
program. Further, because multiple objectives are to be considered, the use of
multiple measures is implied. These considerations led to the early adoption of
a multivariate analysis of variance (MANOVA) evaluation model involving both an
experimental (open education) group and a control (traditional education) group.

Specific characteristics of the open classroom, the educational program
conducted therein, and student selection procedures have been previously
described in detail (Sewell & Dornseif, 1973; Sewell & Dornseif, 1974; Dornseif
et al., 1974) and will only be summarized here. The OSCAR (open education)
program is conducted in a single, undivided classroom which houses a total of 140
students in a space somewhat larger than that of four traditional classrooms.
The students include equal numbers of seventh-graders and eighth-graders, boys
and girls, with 35 students in each such subgroup; these students have been
randomly selected from the District's six sending (primary) schools, in order to
represent the District's heterogeneity; a control group of similar students,
similarly chosen, pursues the traditional, departmentalized program of the junior
high school.

During the program's second year (1973-74), the principle of random
assignment of students was maintained, but both the open and control groups
were further subdivided into a number of subgroups on the basis of
characteristics of particular interest; hence both seventh grade groups included
students characterized by their sixth grade teachers as "extraverts," "introverts,"
"academic underachievers," or "likely to do well in an open classroom" (for the
open program) or "likely to do well in a traditional program" (for the control
group); another subgroup was simply randomly assigned to each program, as in
the preceding year. (It should be noted, however, that each such subgroup will
include a very small number of students.)

Students in the control group follow the regular instructional program
of the junior high school, moving from classroom to classroom through the day,
instructed by teachers neither required nor specifically encouraged to
coordinate their instructional programs, and only in a few instances pursuing
schedules identical to those of other control group members. OSCAR students, on the
other hand, spend the majority of their days in the same classroom, with the
same classmates, and instructed by four teachers (Language Arts, Mathematics,
Social Science, and Science) and two teacher-aides. Team teaching and close
coordination of instructional programs are emphasized; small and flexible
instructional groupings and individualization are encouraged.

During the program's first year (1972-73), it should be noted, OSCAR's
eighth-graders had previously experienced a year in the junior high school's
traditional program, while OSCAR's seventh-graders had entered the open program
directly from a non-departmentalized primary school. The following year
(1973-74), of course, eighth-graders had been the first year's seventh-graders and
were the first group to have experienced only the open program of the junior
high school; these students, then, are of particular interest in evaluation
of the open program. While attrition losses were replaced in the OSCAR
classroom, the replacement students were not included in the evaluation program;
hence the data reported herein are derived from eighth-graders who have
experienced two full years of the open program and from seventh-graders who have
experienced one full year of the open program.

SUMMARY OF FIRST-YEAR RESULTS

Evaluations conducted during the program's first year (1972-73) were
largely exploratory in nature, intended primarily to identify rather general 
areas of evaluation in which specific outcome differences might be sought. 
Hence during and at the end of that year a large number of different instru- 
ments were administered; certain of these were retained for use in the second 
year's evaluation effort, but others were discontinued. Year-end analyses were 
conducted in accordance with the project's basic MANOVA design. 

First-year results have previously been published in detail (Sewell &
Dornseif, 1974); these results will merely be summarized here in order to
provide perspective for the second-year findings.

A major concern of the evaluation project was, of course, whether the
OSCAR program would adversely affect academic achievement, which, presumably, is
the fundamental objective of traditional education. Analyses of Stanford
Achievement scores attained by the OSCAR students and the control students
satisfactorily dispelled this concern. The OSCAR group significantly out-performed
the control group in Social Studies and Science; otherwise the two groups did
not differ. In Spelling and in Language, seventh grade OSCAR students surpassed
the achievements of other subgroups. Eighth grade control students, however,
excelled in Arithmetic Computation and Arithmetic Concepts. Other multivariate
Fs were significant as expected: eighth-graders generally outperformed
seventh-graders, and females outperformed males in those variables demonstrating
significant differences.

The Bell Adjustment Inventory (slightly amended to allow for the
students' age level) showed significant differences in two subscales:
Submissiveness-Self-Assertion and Masculinity-Femininity, such that the control
students could be described as more self-assertive and more masculine.
Seventh-graders of both groups were found to be more self-assertive than
eighth-graders. Females of both groups were found to be higher in emotionality,
while males were found (fortunately!) to be higher in masculinity.

Of the 18 scales of the California Psychological Inventory, only one
was found to differentiate the two groups: Sense of Well-Being; the control
group mean was higher than the OSCAR mean.

The (Bell) School Inventory, an attitude-toward-school instrument, did
not differentiate the two groups.

The Piers-Harris Children's Self Concept Scale yielded data showing 
higher self-concepts on the part of the control students.

Rotter's Locus of Control measure did not differentiate the two groups. 

"Success" Analyses

On the assumption that traditional and open education programs,
resting upon different theoretical foundations, would have different objectives
(to some degree at least), an attempt was made to evaluate each program in terms
of its objectives. Unfortunately, repeated attempts to secure behavioral
objectives from instructional personnel were unsuccessful; and evaluation efforts
were redirected toward differentiating between the objectives of both programs.
A line of reasoning was adopted which held that a teacher's evaluation of a
student's success will be a function of the teacher's objectives and of the
student's attainment of those objectives. Hence study of the teacher's more
highly evaluated students, in comparison to those receiving lower evaluations,
should reveal characteristics which contribute to the teacher's objectives.

Each of the four OSCAR teachers and four teachers of the control
students were asked to rate their respective students in each of four rating
dimensions (each of which was behaviorally defined): attitude, knowledge, skills,
and sociability. Each OSCAR teacher's ratings were subsequently converted into
an individual z-distribution; each student's four ratings provided by that
teacher were then z-converted; the four teachers' z-converted ratings were then
summed, and in this manner each student was assigned a mean z-rating based
upon 16 individual ratings. The same procedure was followed with respect to
control students' ratings by their teachers.

Students achieving mean ratings above the median for their group
were (somewhat arbitrarily) classed as "successful" students, while those
below the median were considered "unsuccessful." Multivariate analyses then
compared "successful" to "unsuccessful" students in each group and, more
importantly, attempted to detect significantly different variables as a way of
defining "success" in the two groups.

Quite a number of variables were found to differentiate "successful"
from "unsuccessful" students in both groups, and, in general, there were few
differences in the patterns for both groups; "successful" OSCAR students were
found to have a higher sense of communality and a more internalized locus of
control than "unsuccessful" students in that group, while neither of these
distinctions held for the control group. "Successful" OSCAR students were found
to differ from their counterparts in the control group only in the OSCAR
students' lower self-concepts and lower sense of well-being. "Unsuccessful"
OSCAR students differed from their control counterparts only in their lower
sense of community. In any event, "successful" students in both groups were
most clearly differentiated from their "unsuccessful" classmates on the bases
of academic achievement variables.

These patterns rather strongly suggested that objectives of the OSCAR
program differed little, if any, from those of the traditional program. This
conclusion was supported by an additional finding: that intercorrelations of
the OSCAR teachers' ratings of their students were not particularly impressive,
averaging approximately .50. A team-teaching approach would seem to imply more
highly intercorrelated ratings, and a similar implication would seem to be
warranted in the case of a highly individualized program.

Further analyses of the teachers' use of student ratings revealed the 
existence of certain apparently biasing factors, such that within the OSCAR 
group a "successful" student was most likely to be a seventh grade girl; nei- 
ther grade level nor sex was found to be significantly predictive of "success" 
in the control group.

Conclusions and Subsequent Developments 

These findings led to an inevitable conclusion that in practice the 
OSCAR program had not differentiated itself from traditional education; the 
teachers' emphases continued to rest almost totally upon academic achievement. 
Problems of classroom management may well have contributed to the emphases, 
since teachers' comments frequently concerned discipline, order, and noise; 
such difficulties are also suggested by the ratings biases in favor of those 
who would, presumably, be the most cooperative and orderly students. 

The final report of this first-year evaluation noted (Sewell &
Dornseif, 1974, p. 23-24): "It is to be expected that as their experience
accumulates the OSCAR teachers will develop coping strategies which will enable them
to focus more effectively upon their students' non-cognitive development; that
is, "success" will be redefined to expand concepts of responsibility and
sociability and to de-emphasize concepts of order and discipline. The first year of
this program has clearly demonstrated that academic standards can be maintained.
With the teachers' increasing confidence and competence, the potential
contributions of open education to the student's personal and social development can
more meaningfully be assessed."

The results, conclusions, and implications of this evaluation were
subsequently presented to and discussed with the OSCAR teachers; they were
encouraged to develop appropriate objectives and strategies. One of the four
OSCAR teachers elected not to continue in the program and was replaced by
another volunteer.

SECOND-YEAR EVALUATION

Students who had been seventh-graders during the first year of this 
study were, of course, eighth-graders during the second year (1973-74); and by
the conclusion of that year, these students had experienced two full years in 
their respective programs (OSCAR or traditional). Neither group, it should be 
remembered, had had any experience with the other group's program. The OSCAR 
eighth-graders, hence, were free from the possible contamination of such exper- 
ience, unlike their previous year's predecessors. 

Further, the OSCAR teachers had had the benefit of another year's exper- 
ience, and the possible morale problems associated with the OSCAR program (Sewell 
& Dornseif, 1974, p. 22-23) appeared largely to have been resolved.

The second-year evaluation, then, may be considered more critical than
that of the first year. With respect to the eighth grade students, analyses of
their achievements and performances would more adequately provide an evaluation
of the open program's total and general value; those students had originally been
simply randomly assigned to the program. The second year's seventh-graders,
having been randomly assigned to subgroups within the larger groups, would,
hopefully, provide a basis for evaluation of each program's special advantages for
each of these subgroups. In practice, however, due to attrition and absences at
testing times, the number of students in each of these subgroups is typically
too small to permit specific subgroup comparisons; hence, most of the following
analyses will deal with the seventh grade students as if they had been randomly
assigned to each program without reference to subgroup characteristics. These
differences between seventh-graders and eighth-graders, nevertheless, recommend
that data for the two grades be separately considered.

All of the analyses and results reported here are based upon year-end
data only: measurements secured during May and June of the program's second year.
Analyses based upon certain mid-year data of the second year have previously
been reported (Dornseif et al., 1974).

Results: Absolute Analyses 

As before (Sewell & Dornseif, 1974), two varieties of analyses were
conducted: absolute and relative. The absolute analyses are concerned solely
with absolute differences between the OSCAR and Control groups. These analyses 
accept some fundamental, underlying differences between the two programs and 
seek to establish not the nature of such differences but the ways in which such 
differences influence student outcomes. Hence the primary focus of these analy- 
ses is outcome measurements (the dependent variables) as functions of the two 
programs (the primary independent variable). Other independent variables of in- 
terest are grade level (seventh vs. eighth) and sex (male vs. female). The 
fundamental design of these analyses, then, is that of a 2 x 2 x 2 MANOVA (or, 
when appropriate, ANOVA). The results of each analysis are summarized briefly
below.
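
The structure of this design can be made concrete with a short sketch. The following Python fragment is offered only as an illustration, not as part of the original analysis; it enumerates the eight cells of the 2 x 2 x 2 layout and the seven effects (three main effects, three two-way interactions, and one three-way interaction) that such a design tests. The variable names are chosen here for convenience.

```python
from itertools import combinations, product

# Factors and levels as named in the text.
factors = {
    "Group": ("OSCAR", "Control"),
    "Grade": ("seventh", "eighth"),
    "Sex": ("male", "female"),
}

# One cell of the design for every combination of one level per factor.
cells = list(product(*factors.values()))

# Main effects are single factors; interactions combine two or three factors.
effects = [
    " x ".join(combo)
    for size in range(1, len(factors) + 1)
    for combo in combinations(factors, size)
]

print(len(cells))   # number of cells in the design
for effect in effects:
    print(effect)
```

Each dependent measure (or, in the multivariate case, each set of measures) is thus examined for seven effects, which is why seven Fs are available for every instrument.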

Stanford Achievement Scales. On the basis of previous experience, only
four of the Stanford Achievement scales were administered at year's end:
Paragraph Meaning, Arithmetic Concepts, Arithmetic Computation, and Science. The
achievement data employed in this analysis were grade equivalents. The
multivariate F for Group (OSCAR vs. Control) was not significant (F = 1.867, df =
4/208, p>.05). As would be expected, the multivariate F for Grade was
significant (F = 2.810, df = 4/208, p<.03); subsequent univariate analyses showed
the two grades to differ significantly only in Arithmetic Computation (F =
1.222, p<.04) and Science (F = 3.400, p<.005), and in both areas
eighth-graders out-performed seventh-graders.

The multivariate F for Sex was also significant (F = 7.011, df = 4/208,
p<.0001). The subsequent univariate Fs were significant for Paragraph Meaning
(F = 4.409, p<.04), Arithmetic Concepts (F = 10.801, p<.001), and Arithmetic
Computation (F = 3.953, p<.05). In each case, the better performance was
achieved by female students.

The multivariate Fs for all two-way interactions (Group x Grade, Group
x Sex, Grade x Sex) failed to attain statistical significance. The three-way
interaction (Group x Grade x Sex), however, was statistically significant
(multivariate F = 2.926, df = 4/208, p<.03). The univariate Fs for all of the
Stanford scales were significant, and the patterns were as would be expected
from the preceding main effects analyses.

It should be noted that, in general, cell means for all of these de- 
pendent variables exceeded national norm grade equivalents, and a number of these 
cell means substantially exceeded national norms.

Learning Environment Inventory. As a measure of student attitudes, the
first-year study had employed The School Inventory, a 100-item inventory
developed by Bell and yielding a single score. For both practical and theoretical
reasons, use of this instrument was not continued during the second-year
evaluation. As has been previously reported (Dornseif et al., 1974), the
Learning Environment Inventory, developed by Walberg and Anderson (Anderson,
1973), was selected as a measure of students' attitudes: first, because of the
face validity of its multi-dimensional approach; second, because it has been
employed in a number of studies which appear to support claims of reliability
and validity (Anderson, 1973); and third, because a majority of the audience
to whom this report is directed can be expected to have some familiarity with
the instrument. Preliminary data, secured by administration of the LEI,
appeared to demonstrate its usefulness in this evaluation program, although
the authors maintained some concern over the appropriateness of its current
application.

Near the end of the year, the LEI was again administered to 103 OSCAR
students and 107 Control students. Due to the limited capacity of the scoring
computer, only 98 of the 105 items were scored; the seven items of the
Diversity scale were eliminated since the reported reliability of these items was
lowest.

Scale scores for the remaining 14 scales were analyzed through the
study's basic MANOVA design. Of the seven multivariate Fs, four were
statistically significant: Group, Sex, Grade, and Group x Sex. The results of
subsequent univariate analyses are summarized below.
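
The degrees of freedom attached to the LEI multivariate Fs can be checked by hand. The sketch below assumes (the text does not say so explicitly) that all 210 tested students entered the analysis and that each effect was evaluated as an exact F, which is the case whenever an effect carries a single hypothesis degree of freedom, as every effect in a 2 x 2 x 2 design does. Under those assumptions the numerator df equals the number of scales and the denominator df equals the error df minus the number of scales, plus one. The function name is hypothetical.

```python
def one_df_manova_df(n_students, n_cells, n_measures):
    """Return (numerator df, denominator df) for a single-df multivariate effect."""
    # Within-cell (error) degrees of freedom for the factorial design.
    error_df = n_students - n_cells
    # Exact F for a one-df hypothesis: p and (error df - p + 1).
    return n_measures, error_df - n_measures + 1


# LEI: 103 OSCAR + 107 Control students, 8 design cells, 14 scales scored.
lei_df = one_df_manova_df(n_students=103 + 107, n_cells=8, n_measures=14)
print(lei_df)  # (14, 189)
```

The result agrees with the denominator df of 189 reported for the LEI analyses, which suggests that few if any LEI protocols were lost before analysis.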

The multivariate F for Group was 9.195 (df = 14/189, p<.0001). The
OSCAR mean was greater than the Control mean for three scales: Cliqueness (F =
32.749, p<.0001), Disorganization (F = k.QSk, p<.03), and Democratic (F =
7.896, p<.006). The Control mean was greater than the OSCAR mean for four
scales: Formality (F = 4.528, p<.04), Speed (F = 19.^^02, p<.0001), Favoritism
(F = 9.411, p<.003), and Apathy (F = 4.150, p<.05).

The multivariate F for Sex was 2.060 (df = 14/189, p<.02). The mean
for females was greater than the mean for males in the case of three scales:
Cohesiveness (F = 14.104, p<.001), Environment (F = 14.080, p<.001), and Goal
Direction (F = 6.178, p<.02). Scale means were greater for males than for
females in three other cases: Favoritism (F = 5.136, p<.03), Disorganization (F
= 6.403, p<.02), and Apathy (F = 11.049, p<.01).

The multivariate F for Grade was 1.813 (df = 14/189, p<.04). Only
the Disorganization scale showed univariate significance (F = 4.337, p<.04);
here the eighth grade mean exceeded the seventh grade mean.

The multivariate F for the Group x Sex interaction was 1.959 (df =
14/189, p<.03). The scales contributing to this significance were
Cohesiveness (F = 10.642, p<.002), Formality (F = 6.546, p<.02), Environment (F =
14.328, p<.001), Goal Direction (F = 5.836, p<.02), Disorganization (F =
14.884, p<.001), and Apathy (F = 4.926, p<.03). Directional differences in
means are evident from the preceding main effects discussions.

Piers-Harris Children's Self Concept Scale. As noted previously, 
analysis of first-year data provided by this instrument showed Control students 
to have significantly higher self-concept scores than OSCAR students; this find- 
ing and its implications were sources of considerable concern. Hence the use 
of this instrument was continued in the second-year evaluation. 

This scale yields a single score; the data were, therefore, ANOVA
processed. None of the main effects was found significant: Group (F = 2.68, df =
1/209), Grade (F = 1.37), or Sex (F = .21). Of the two-way interactions, only
Group x Grade was significant (F = 6.30, p<.05); the F for Group x Sex was 1.99,
while the F for Grade x Sex was .31. The three-way interaction, Group x Grade
x Sex, was not significant (F = ^.09).

Subsequent analyses showed this significant Group x Grade interaction 
to be solely the result of differences between the seventh grade OSCAR students
and the other groupings. Group means are as follows: seventh grade OSCAR,
59.72; seventh grade Control, 53.25; eighth grade OSCAR, 53.36; eighth grade
Control, 55.70.
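
The shape of this interaction can be seen by simple arithmetic on the four reported means. The sketch below is illustrative only: it uses the unweighted cell means as printed (cell sizes are not reported here, so no significance test is attempted) and computes the OSCAR-minus-Control difference within each grade, followed by the difference between those two differences.

```python
# Mean self concept scores as reported in the text (second year).
means = {
    ("seventh", "OSCAR"): 59.72,
    ("seventh", "Control"): 53.25,
    ("eighth", "OSCAR"): 53.36,
    ("eighth", "Control"): 55.70,
}

# OSCAR-minus-Control difference within each grade.
group_gap = {
    grade: round(means[(grade, "OSCAR")] - means[(grade, "Control")], 2)
    for grade in ("seventh", "eighth")
}

# The interaction contrast is the difference between the two gaps.
interaction_contrast = round(group_gap["seventh"] - group_gap["eighth"], 2)

print(group_gap)             # {'seventh': 6.47, 'eighth': -2.34}
print(interaction_contrast)  # 8.81
```

The OSCAR advantage of 6.47 points among seventh-graders, set against a Control advantage of 2.34 points among eighth-graders, is the pattern the significant Group x Grade F reflects.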
The eighth-graders of the current study had completed the Self Concept
Scale as seventh-graders the previous year, so the group means for these students 
were compared. While the mean score for the Control students had remained vir- 
tually unchanged (increasing slightly, from 55.^ to 55.70), the mean for the 
OSCAR students had shown a greater increase (from 51.79 to 53.36). Analysis,
however, showed these changes to be statistically non-significant, even when sex
was considered. The general patterns in these year-to-year changes are seen in
the following table. 

MEAN SELF CONCEPT SCORES

                            1974      1973
Seventh grade OSCAR        59.72     51.79
Seventh grade Control      53.25     55*^^
Eighth grade OSCAR         53.36     51.56
Eighth grade Control       55.70     57.07

Although analyses of seventh grade subgroups had been intended, the 
number of students in each subgroup was finally too small, and simultaneously 
too disproportionate between subgroups, to permit a reasonable statistical 
analysis. It should be noted, however, that the mean score of each OSCAR sub- 
group exceeded the mean of the corresponding Control subgroup. Further, within 
each major group (OSCAR and Control), the highest mean self concept scores 
were recorded by the "Extraversion" subgroup, and the lowest mean score by the
"Introversion" subgroup. Somewhat smaller mean score differences between
subgroups were noted in the Control group than among OSCAR students; that is, the
Control students appeared generally more homogeneous in self concept than did 
the OSCAR students. 

Results: Relative Analyses

As had been done previously (Sewell & Dornseif, 1974), teachers
associated with both programs were asked to provide individual ratings of each
student. In the case of the OSCAR program, each of the four teachers rated all
of the program's students; students in the Control group, however, were widely
dispersed throughout the Upper Grade Center, and no single teacher was acquainted
with all these students; hence ratings of the Control students were made by quite
a large number of teachers, each rating a relatively small number of students.

Teachers were asked to rate each student on four dimensions: Attitude, 
Knowledge, Skills, and Sociability. The following behavioral definitions of 
these rating categories were provided: 

Attitude: Student displays positive attitudes toward school,
teachers, other school personnel, and other students.
Knowledge: Student demonstrates mastery of academic content
appropriate to his/her age, grade level, and apparent ability.
Skill: Student demonstrates application of academic content
within school and displays ability to apply academic content
in non-academic settings.
Sociability: Student demonstrates respect for the rights and
feelings of others and demonstrates ability to work effectively
and cooperatively with others.
Using a five-point scale, teachers were asked to evaluate each student independ- 
ently of other students and to use each rating dimension independently of the 
others. Previous experience with this rating process suggests that the four 
dimensions probably constitute two rating factors: one described by the Knowledge
and Skills categories, and the other by the Attitude and Sociability categories. 

The data produced by this rating process, then, consisted of four
categorical ratings of each OSCAR and Control student by each of four teachers
associated with the respective programs. To eliminate biases introduced by the
teacher's idiosyncratic use of the scales, each teacher's ratings of all
students were converted into a z distribution, and each individual rating converted
into a corresponding z; these were then summed and averaged across rating
categories to provide a mean z rating of each student for each teacher. Finally,
the four zs thereby obtained were averaged to provide a mean z rating across
categories and across the teachers who had rated the student. The sixteen
ratings were compressed in this manner to provide a single, minimally biased
evaluation of each student.
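
The compression of sixteen ratings into one score may be easier to follow as a short sketch. The Python fragment below is an illustration under stated assumptions rather than a record of the original computation: it assumes that each teacher's ratings were pooled across the four categories into a single z distribution, and that the population standard deviation was used; the text specifies neither detail. The function name, the teacher and student labels, and the ratings themselves are all hypothetical.

```python
from statistics import mean, pstdev


def mean_z_ratings(ratings):
    """Collapse teacher ratings into one mean z rating per student.

    `ratings` maps teacher -> student -> list of four category ratings
    (Attitude, Knowledge, Skills, Sociability) on the five-point scale.
    """
    per_student = {}
    for by_student in ratings.values():
        # Assumption: one z distribution per teacher, pooled over categories.
        pooled = [r for marks in by_student.values() for r in marks]
        center, spread = mean(pooled), pstdev(pooled)
        for student, marks in by_student.items():
            z_values = [(r - center) / spread for r in marks]
            # Mean z across the four categories for this teacher.
            per_student.setdefault(student, []).append(mean(z_values))
    # Mean across the teachers who rated each student.
    return {student: mean(zs) for student, zs in per_student.items()}


# Hypothetical ratings from two teachers for three students.
ratings = {
    "teacher_a": {"s1": [5, 4, 5, 4], "s2": [3, 3, 2, 3], "s3": [1, 2, 2, 1]},
    "teacher_b": {"s1": [4, 4, 3, 4], "s2": [3, 2, 3, 3], "s3": [2, 2, 1, 2]},
}
mean_z = mean_z_ratings(ratings)
for student, z in sorted(mean_z.items()):
    print(student, round(z, 2))
```

Because every teacher's ratings are re-expressed relative to that teacher's own mean and spread, a habitually lenient or severe rater contributes no more to a student's final score than any other rater.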

The final distribution of mean z ratings was then divided (separately
for each group) at the median. For the OSCAR students, this median was .05,
while the median of the Control group was -.06. In each group, students
achieving ratings greater than the median were arbitrarily classed as
"successful" students, while those whose ratings were less than the median
were classed as "unsuccessful" students. These dichotomizations provided the
bases for all the "success" analyses reported here.
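
The median split itself is simple enough to state in a few lines. The sketch below is illustrative only: the function name and the four ratings are hypothetical, and because the report does not say how a rating exactly equal to the median was handled, such cases are left unclassified here.

```python
from statistics import median


def dichotomize(mean_z):
    """Split one program's students at the median of their mean z ratings.

    mean_z: {student: mean z rating}, for ONE program only, since the
    report performs the split separately for each group.
    """
    cut = median(mean_z.values())
    labels = {}
    for student, z in mean_z.items():
        if z > cut:
            labels[student] = "successful"
        elif z < cut:
            labels[student] = "unsuccessful"
        else:
            labels[student] = None  # tie at the median: not specified in the report
    return cut, labels


# Hypothetical ratings for four students in one program.
cut, labels = dichotomize({"a": -1.0, "b": 0.0, "c": 0.2, "d": 1.5})
```

Because the split is made within each program, "success" is always relative to a student's own program, which is why the later analyses cannot show an overall program difference in ratings.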

The primary purpose of these "success" analyses is to derive a definition
of "success" for each program: such a definition would permit inferences
about the program's objectives and would help to identify the distinctive
characteristics of each program. As previously noted, this analytical
procedure had prompted the conclusion that during the OSCAR program's first
year the objectives of that program had differed little, if at all, from the
objectives of the traditional program, since profiles of "successful" and
"unsuccessful" OSCAR students were virtually identical to those of their
counterparts in the Control group.

Intercorrelations of Teachers' Ratings. Since all four of the OSCAR
teachers had rated the same students, the correlation of each teacher's
final z rating with that of each other teacher was calculated. Because of the
closeness of the professional relations and the interdisciplinary intent of the
OSCAR program, a pattern of relatively high and consistent intercorrelations
should be expected. These intercorrelations are given in the following table.

INTERCORRELATIONS OF TEACHERS' RATINGS
OF OSCAR STUDENTS

                   Language     Social
                     Arts       Studies      Math      Science

Language Arts         --         .632        .577       .608
Social Studies                    --         .671       .697
Math                                          --        .584

These correlations are generally higher and far more mutually consistent
than those encountered at the end of the program's first year, at which time
the range was from .395 to .861 and the mean r was .524 (Sewell & Dornseif,
1974). The consistency of these correlations indicates a substantially greater
unanimity of the teachers' perceptions.
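
For comparison with the first-year figures, the second-year range and mean can be worked out directly from the six tabled coefficients. The sketch assumes a simple arithmetic mean of the rs; the report does not say how the first-year mean was computed (for example, whether a Fisher z transformation was used), so the comparison is approximate.

```python
from statistics import mean

# Second-year intercorrelations of the four OSCAR teachers' ratings,
# copied from the table above (upper triangle only).
second_year_r = {
    ("Language Arts", "Social Studies"): 0.632,
    ("Language Arts", "Math"): 0.577,
    ("Language Arts", "Science"): 0.608,
    ("Social Studies", "Math"): 0.671,
    ("Social Studies", "Science"): 0.697,
    ("Math", "Science"): 0.584,
}

values = list(second_year_r.values())
lowest, highest = min(values), max(values)
mean_r = mean(values)          # simple arithmetic mean (assumption)
spread = highest - lowest      # width of the range of coefficients

print(f"range {lowest:.3f} to {highest:.3f}, mean r = {mean_r:.3f}")
```

On this reckoning the second-year coefficients run from .577 to .697, a spread of .120 against .466 in the first year, with a mean of about .628 against .524.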

Since ratings of the Control students were derived from a large number 
of teachers, each of whom rated only a few students, it was deemed impractical 
to attempt intercorrelations of ratings by those teachers. 

Grade and Sex Factors in Teachers' Ratings. As previously noted, the
first-year evaluation indicated that teachers' ratings were apparently
strongly influenced by the grade level and sex of the student, such that in the
OSCAR program a seventh grade girl was significantly more likely to be rated
"successful." Hence a similar analysis was conducted with second-year data.
Means of the z ratings are given in the following table.



MEAN z RATINGS OF STUDENTS BY TEACHERS
IN OSCAR AND CONTROL GROUPS

                      OSCAR                   CONTROL
                 Male      Female         Male      Female

7th Grade         .06        .02           .30        .32
                (n=31)     (n=35)        (n=27)     (n=32)

8th Grade        -.35        .39          -.28        .02
                (n=22)     (n=26)                   (n=25)

These data were employed in a three-way analysis of variance, the
results of which are summarized in the following table.



ANALYSIS OF VARIANCE: MEAN z RATINGS
OF STUDENTS BY TEACHERS

Source             SS         df         F

Grade (G)         2.226        1
Sex (S)           2.881        1
Program (P)        .248        1         .517
G X S             3.779        1        7.889**
G X P             2.801        1        5.847*
S X P              .258        1         .538
G X S X P          .826        1        1.724
Error           102.568      214
Total           115.627      221

 *p < .05
**p < .01

The two programs do not differ significantly in this analysis, of course, by
virtue of the separate z transformations of the original ratings. Examination
of the preceding table of cell means shows the directionality of the
significant differences. Seventh-graders receive higher ratings than
eighth-graders; girls receive higher ratings than boys; eighth grade boys
receive the lowest ratings; and differences in ratings are most striking among
Control group boys.

Grade and Sex Factors in "Success" and "Failure." The preceding
analysis was not concerned with the roles of grade and sex in the student's
assignment to the "successful" and "unsuccessful" comparison groups previously
described. The significance of these factors was tested through a series of
chi-square analyses, separately performed for each sex, each grade level, and
each group. Similar analyses of first-year data, it will be recalled, had
shown seventh grade OSCAR girls to be disproportionately "successful," and this
finding, in association with other findings, had suggested a higher emphasis
upon quiet and orderliness in the OSCAR program than appeared in the traditional
program. Relevant second-year data are shown in the following table.

NUMBERS OF STUDENTS RECEIVING MEAN z RATINGS
AS "SUCCESSFUL" OR "UNSUCCESSFUL"

                                OSCAR                CONTROL
                            Male    Female        Male    Female

7th Grade   Successful       16       14           11       21
            Unsuccessful     15       21           16       11

8th Grade   Successful        7       20            8       14
            Unsuccessful     15        6           16       11


In OSCAR, sex alone is not a significant contributor to "success"
(χ² = 1.72, df = 1, p > .05); nor is grade level alone (χ² = 1.28, df = 1,
p > .05). Considered simultaneously, however, sex and grade level are
significant predictors of "success" (χ² = 10.32, df = 3, p < .02); despite this
significance, the strength of the association is not particularly impressive:
the Goodman-Kruskal index of predictive association (Hays, 1963, pp. 606-610)
is .122. As the data of the preceding table show, OSCAR seventh grade girls are
disproportionately "unsuccessful" and eighth grade girls are disproportionately
"successful"; somewhat opposite trends hold for the seventh and eighth grade
boys, but to a lesser extent.
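
The simple sex-by-"success" tests can be approximately reproduced from the tabled counts. The sketch below computes an ordinary Pearson chi-square without continuity correction, which is an assumption: the report does not state which form of the statistic was used. The counts are summed over grade level from the preceding table, and the function name is hypothetical.

```python
def chi_square(table):
    """Pearson chi-square for a contingency table given as a list of rows.

    No continuity correction is applied (assumption; the report does not
    specify the form of the statistic).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df


# Rows: male, female. Columns: successful, unsuccessful.
# Each count is the 7th grade cell plus the 8th grade cell from the table.
oscar_by_sex = [[16 + 7, 15 + 15], [14 + 20, 21 + 6]]
control_by_sex = [[11 + 8, 16 + 16], [21 + 14, 11 + 11]]

oscar_stat, oscar_df = chi_square(oscar_by_sex)
control_stat, control_df = chi_square(control_by_sex)
print(f"OSCAR:   chi-square = {oscar_stat:.2f}, df = {oscar_df}")
print(f"Control: chi-square = {control_stat:.2f}, df = {control_df}")
```

Computed this way, the statistics come out close to, though not identical with, the values of 1.72 for OSCAR and 6.26 for Control given in the text; the small differences may reflect rounding or a slightly different computing formula.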

In Control, however, the simple relation of sex to "success" is
significant (χ² = 6.26, df = 1, p < .02), while grade level is not (χ² = .92,
df = 1, p > .05). In this group, boys of both grade levels are
disproportionately "unsuccessful," while girls are disproportionately
"successful."

Stanford Achievement Scales. Three different analyses of Stanford
scores (Paragraph Meaning, Arithmetic Concepts, Arithmetic Computation, and
Science scales) in relation to "success" groupings were conducted: OSCAR
"successful" vs. OSCAR "unsuccessful"; Control "successful" vs. Control
"unsuccessful"; and OSCAR "successful" vs. Control "successful." All took the
form of multivariate analyses of variance.

In the comparison of OSCAR "successful" vs. "unsuccessful" students,
the multivariate F was significant (19.767, df = 4/99, p < .0001); subsequent
univariate analyses showed significant differences for all four of the scales,
the higher mean scores in each case having been achieved by the "successful"
students.

In the comparison of Control "successful" vs. "unsuccessful" students,
the multivariate F was also significant (10.969, df = 4/98, p < .0001), and the
univariate Fs for each of the scales were significant. Again the higher scores
were attained by the "successful" students.

In comparing "successful" OSCAR students to "successful" Control
students, the multivariate F was not statistically significant (1.709, df =
4/103, p > .05). Scale means for the OSCAR students were in all cases greater
than those of the Control students.
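
The degrees of freedom of these multivariate tests (read here as 4/99, 4/98, and 4/103) can be related to the number of students in each comparison. The sketch assumes that each comparison is a two-group multivariate test equivalent to Hotelling's T², for which the exact F has p and N - p - 1 degrees of freedom (p measures, N students in the two groups combined). The report does not name the procedure, so both that assumption and the implied sample sizes are inferences, not reported figures.

```python
def two_group_manova_df(n_total, n_measures):
    """Degrees of freedom of the exact F for a two-group multivariate test.

    Assumes Hotelling's T-squared: df = (p, N - p - 1).
    """
    return n_measures, n_total - n_measures - 1


def implied_n(df_error, n_measures):
    """Total number of students implied by a reported error df."""
    return df_error + n_measures + 1


# Four Stanford scales; error df as reported for the three comparisons.
stanford_error_df = {
    "OSCAR successful vs. unsuccessful": 99,
    "Control successful vs. unsuccessful": 98,
    "OSCAR successful vs. Control successful": 103,
}
implied = {name: implied_n(df, 4) for name, df in stanford_error_df.items()}
for name, n in implied.items():
    print(f"{name}: about {n} students with complete scores")
```

Under this assumption each comparison involves slightly fewer students than were rated, as would be expected if a few students lacked complete achievement scores.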


Learning Environment Inventory. As before, only 14 of the 15 scales
of this inventory were employed. The same multivariate analyses were conducted
as in the case of the Stanford Achievement Scales.

In the comparison of OSCAR "successful" to "unsuccessful" students, the
multivariate F was significant (2.617, df = 14/85, p < .003). Univariate analyses
showed "unsuccessful" means to exceed "successful" means on three scales: Speed,
Favoritism, and Satisfaction.

The comparison of Control "successful" and "unsuccessful" students
produced a significant multivariate F (2.437, df = 14/83, p < .006). "Successful"
students achieved higher means on the following scales: Cohesiveness,
Environment, Goal Direction, Apathy, and Disorganization. "Unsuccessful" students
scored higher on Friction, Democratic, and Satisfaction.

The comparison of "successful" OSCAR students to "successful" Control
students yielded a significant multivariate F (5.570, df = 14/88, p < .0001).
OSCAR students achieved higher mean scores in Difficulty and Democratic, while
Control students' means were higher on the Formality, Speed, Favoritism, and
Cliqueness scales.

Piers-Harris Children's Self Concept Scale. Mean scores on this instru- 
ment for each cell of the analytical design are shown in the following table. 

MEAN SELF CONCEPT SCORES

                        OSCAR                     CONTROL
                 7th Grade   8th Grade      7th Grade   8th Grade

Successful         63.10       57.37          56.63       59.85
                  (n=28)      (n=27)         (n=30)      (n=21)

Unsuccessful       56.74       47.33          49.20       52.33
                  (n=35)      (n=18)         (n=24)      (n=24)



Because of the previously noted disparities in distribution among cells of the
larger design, sex was not considered in the analysis of these data.

The results of a three-way analysis of variance of self concept scores
are provided in the following table. The "Rating" variable simply refers to
the "successful" vs. "unsuccessful" categorization.

ANALYSIS OF VARIANCE: SELF CONCEPT SCORES

Source             SS          df         F

Program (P)       324.56        1        2.361
Rating (R)       2492.50        1       18.135***
Grade (G)         222.69        1        1.620
P X R              13.26        1         .096
P X G             946.30        1        6.885**
R X G             120.11        1         .873
P X R X G         718.69        1        5.229*
Error           27351.11      199
Total           32189.22      205

  *p < .025
 **p < .01
***p < .0005
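
The F ratios in this table follow from the sums of squares and degrees of freedom in the usual way: each effect's mean square (SS divided by its df) is divided by the error mean square. The sketch below recomputes them from the tabled values, with the Program sum of squares taken as 324.56; that reading is an assumption, but it is the value for which the effects and error sum to the reported total.

```python
# Sums of squares for the self concept analysis, as read from the table.
effect_ss = {
    "Program (P)": 324.56,   # assumed reading; consistent with the total SS
    "Rating (R)": 2492.50,
    "Grade (G)": 222.69,
    "P X R": 13.26,
    "P X G": 946.30,
    "R X G": 120.11,
    "P X R X G": 718.69,
}
error_ss, error_df = 27351.11, 199
effect_df = 1  # every factor has two levels, so each effect has 1 df

ms_error = error_ss / error_df
f_ratios = {name: (ss / effect_df) / ms_error for name, ss in effect_ss.items()}
total_ss = sum(effect_ss.values()) + error_ss

for name, f in f_ratios.items():
    print(f"{name:<12} F = {f:.3f}")
```

The recomputed ratios agree with the tabled F values to within rounding, which also serves as a check on the individual sums of squares.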



In order to explore contributions to the obviously strong relation
between teachers' ratings and students' self concept scores, a series of Pearson
rs was computed. Differences in these coefficients will, of course, aid in
understanding the interaction patterns reported above. These correlations are
reported in the following table.

CORRELATIONS BETWEEN MEAN z RATINGS
AND SELF CONCEPT SCORES

                    OSCAR         CONTROL

Total Group          .356           .317
                   (n=108)         (n=99)

7th Grade            .397           .267
                    (n=63)         (n=54)

8th Grade            .325        [illegible]

Males               -.347          -.192
                                   (n=47)

Females              .350           .385
                    (n=59)         (n=52)

DISCUSSION 

As noted previously, the two varieties of analyses reported here --
absolute and relative -- had two different but complementary purposes. The
absolute analyses, of course, were intended simply to discover any absolute
differences in outcomes of the two programs. The relative analyses were intended
to explore and define differences in the natures of the two programs. Two
educational programs may yield quite different outcomes despite the absence of
any essential differences in the programs themselves; and such different
outcomes might then be attributed to differences in the students, in the
teachers, in peculiar interactions of students and teachers, in differing
physical environments, etc. Outcome differences stemming from inherent
differences in students participating in each program have specifically been
eliminated from this evaluation design by random assignment of students to the
two programs; nevertheless, no similar controls were possible or practical with
respect to the programs' teachers or the physical environment in which each
program is conducted. Hence a design limited to absolute analyses of student
outcomes necessarily begs the question of whether the programs are truly
different.

Many innovative programs fail to achieve their stated objectives, while
others achieve certain objectives only at the expense of other objectives. An
early and very serious concern of those associated with the OSCAR program was
whether traditional educational objectives might be relegated to a subordinate
status in the pursuit of wider, non-traditional objectives. Hence academic
achievement of the two groups has been measured at every stage of the
evaluation. And it is gratifying to find that whatever else the OSCAR program has
achieved, the high educational standards of the District have not been
sacrificed: the academic achievements of OSCAR students do not differ from those
of students in the traditional program; indeed, if anything, there appears to be
a slightly higher (statistically non-significant) level of academic achievement
in the OSCAR program.

Overall, the findings of the Learning Environment Inventory are equally
gratifying. OSCAR students perceive their environment as more clique-ish, more
disorganized, and more democratic; in view of program and physical environmental
differences in the two programs, such differences are quite appropriate. The
perceptions of disorganization can easily be attributed to the sheer number of
students housed in a single, large classroom. Higher Cliqueness scores probably
represent both a normal clustering of students within the physical environment
and the teachers' efforts to establish small teaching and learning groups. That
clique-ishness is not necessarily adversely evaluated is substantiated by the
students' perceptions of the environment as more democratic.

Complementing these interpretations, Control students perceive their
environment as more formal, more speed-oriented, more tainted by favoritism, and
more characterized by apathy. The clique-ishness of the OSCAR program is
paralleled by the favoritism found in the Control program. That is, the existence
of cliques within the OSCAR program is apparently not a product of teacher
favoritism.

The first-year finding of higher self-concept scores for Control
students was a source of particular concern. This trend was clearly reversed
during the OSCAR program's second year. It may well be that the OSCAR teachers'
increasing experience and confidence were significant factors in this reversal,
since it is most notable among the seventh-graders of that program and
constitutes the major difference in self-concept scores during the program's two
years of operation.

The major conclusion of the first-year evaluation of the OSCAR program
was that the absence of student outcome differences was most likely due to the
absence of differences in the two educational programs; that is, the OSCAR
program evidently had not at that time established a separate identity. This
conclusion was based upon an inability to find consistent and significant
differences through the various relative analyses reported (Sewell & Dornseif,
1974). The data of the present report, however, indicate rather clearly that such
a distinctive identity has been established. A much greater uniformity of
perceptions of their students by the OSCAR teachers suggests that a consistent
basis for such perceptions has been achieved. Similarly, intercorrelations of
Stanford Achievement scores within the OSCAR program are consistently higher
than similar intercorrelations within the traditional program and, indeed,
consistently higher than similar intercorrelations derived from the national
standardization sample for these tests. Taken together, these two factors
indicate a substantial degree of interdisciplinary agreement and coordination,
evidently a necessary ingredient of an OSCAR-type program.

That the two programs differ is also attested by the differing
relations of teachers' ratings to student characteristics in the two programs.
In the traditional program, this relation is fairly simple: girls are
significantly more likely to receive higher ratings than boys. In OSCAR, however,
ratings of girls appear to increase substantially between the seventh and eighth
grades, while ratings of boys show complementary decreases. (Incidentally, the
consistently lower ratings of eighth grade boys have been repeatedly observed
and have been verbally confirmed by other educators; this evaluative phenomenon
appears to be widespread and may be related to the onset of male adolescent
behavior patterns.)

Hence in terms of these ratings alone, the two programs appear to have 
distinctive identities. The remaining analyses were intended to establish the 
nature of these differences. It is quite clear from the data that academic 
achievement is equally important to "success" in both programs. 

In terms of the scales of the Learning Environment Inventory,
"successful" OSCAR students perceive their environment as characterized by
difficulty and democracy, while their counterparts in the Control group find
their environment characterized by formality, speed, favoritism, and
clique-ishness. Within the OSCAR program "success" seems to be related to an
environment characterized by less speed, less favoritism, and less satisfaction.
Within the traditional program "success" is related to an environment
characterized by more cohesiveness, a more enjoyable physical environment,
greater goal direction, more apathy, more disorganization, less friction, less
democracy, and less satisfaction. The relation of "success" to less satisfaction
in both programs poses some challenges to further research.

A conceptual dissatisfaction with the nominal scales of the LEI (and
their sheer number) led to a still-continuing factor analytic evaluation of
this instrument. With item scores as the raw data of this analysis of 98 items,
combined-group data (that is, derived from both OSCAR and Control students)
failed to yield any easily understandable factors, despite various types of
rotations. When, however, data from each group were separately analyzed, two or
three (not 14) reasonably coherent factors did emerge from each analysis, and
these factors were conceptually quite different -- nearly complementary -- for
each group. The existence of these disparate factor structures within each
group can be taken as further evidence of the distinctiveness of each program.
Details of these factor analyses will be reported at a later time, but in the
meantime some interpretive caution with respect to the LEI scales seems
appropriate.

Analyses of the self-concept scale scores show a high self-concept to
be closely related to "success" in both programs. The three-way interaction
(program x rating x grade) noted in these analyses reflects both the generally
high self-concept scores of OSCAR seventh-graders and the anomalous negative
correlations between self-concept and teachers' ratings among males in general
and among eighth grade Control students in particular.

Unfortunately, as noted previously, the small numbers of students in
each program's seventh grade subgroupings precluded detailed statistical
analyses. Inspection of subgroup means, however, indicates data trends generally
consonant with whole-group results. To some extent the data suggest that the
five nominal subgroups (Random Assignment, Underachiever, Introvert, Extravert,
and Teacher Nominated) may be functionally described as really only two
subgroups: Introvert (including, generally, the Underachiever and Introvert
subgroups) and Extravert (including, generally, the Random Assignment, Extravert,
and Teacher Nominated subgroups). Typically, the Introvert group has, of course,
lower mean self-concept scores and lower teachers' ratings, while opposing
tendencies are evident in the data of the Extravert group.

Overall, the results of this second-year evaluation demonstrate that the
OSCAR program has established a separate and distinctive identity, that this
identity broadly conforms to commonly expressed characterizations of "open
education," and that student outcomes of this program are in the desirable
directions with respect to both academic and non-academic achievements. The
somewhat negative tendencies of the program noted as a result of the first-year
evaluation have been either eliminated or substantially ameliorated. The OSCAR
program appears to have established its educational feasibility and value, and
its continuation (and continued evaluation) has been recommended.

Finally, the differences between the results of the first-year
evaluation and the present evaluation strongly support the need for (1) early
planning of evaluation, (2) development and continued use of an appropriately
critical evaluation model, and (3) continuous and continued evaluation, rather
than premature acceptance or rejection of a program on the basis of premature or
hasty evaluation.